ABSTRACT

We present a direct comparison of the clustering properties of two
redshift surveys covering a common volume of space: the recently
completed IRAS Point Source Catalogue redshift survey (PSCz), containing
14500 galaxies with a limiting flux of 0.6 Jy at 60 μm, and the optical
Stromlo-APM survey, containing 1787 galaxies in a region of 4300 deg$^2$
in the south Galactic cap. We use three methods to compare the
clustering properties: the counts-in-cells comparison of Efstathiou
(1995, hereafter E95), the two-point cross-correlation function, and the
Tegmark (1998) 'null-buster' test. We find that the Stromlo variances
are systematically higher than those of PSCz, as expected given the
deficit of early-type galaxies in IRAS samples. However, we find that
the differences between the cell counts are consistent with a linear
bias between the two surveys, with a relative bias parameter
$b_{\rm rel} = b_{\rm Stromlo}/b_{\rm PSCz} \approx 1.3$ which appears
approximately scale-independent. The correlation coefficient $R$ between
optical and IRAS densities on scales $\sim 20\,h^{-1}$ Mpc is $R > 0.72$
at 95% c.l., placing limits on types of 'stochastic bias' which affect
optical and IRAS galaxies differently.

1 INTRODUCTION

Observations of galaxy clustering are a very valuable probe of
cosmology, since they can provide information both on the shape of the
initial density fluctuations and on cosmological parameters such as
$\Omega_0$, the composition of the dark matter etc. When combined with
related observables such as CMB anisotropy and galaxy peculiar motions,
it should in future be possible to perform consistency checks, since
there are 3 observable functions (the CMB spherical harmonics and the
power spectra of the density and velocity fields) predicted by one input
function (the initial power spectrum) and a handful of cosmological
parameters.

Following the pioneering studies by Peebles and coworkers in the 1970s,
the field has undergone rapid growth, with many large galaxy surveys now
in existence, notably the 2-dimensional APM Galaxy Survey (Maddox et al.
1990), the CfA-2 redshift survey (e.g. Vogeley et al. 1992), the Las
Campanas redshift survey (Shectman et al. 1996) and the IRAS PSCz survey
(Saunders et al. 1997).

The galaxy surveys obviously measure the distribution of luminous
matter, which probably does not exactly trace the total mass
distribution, which is dominated by dark matter. For instance, it was
shown by Kaiser (1984) that if galaxies form at peaks of a Gaussian
random field, they will be more strongly clustered than the underlying
mass, an effect dubbed biasing. Thus, understanding the relationship
between galaxies and mass is important both for estimation of
cosmological parameters and for probing the physics of galaxy formation.
A common assumption is 'linear biasing', given by
$\delta_g = b\,\delta_m$, where $\delta_g$, $\delta_m$ are the fractional
overdensities relative to the mean in galaxies and mass respectively,
and $b$ is a constant 'bias parameter'; this assumption must be
inaccurate on small scales for $b > 1$ since $\delta_g \ge -1$. However,
we may still define a bias parameter $b(r)$ by e.g.
$\xi_{gg}(r) = b(r)^2\,\xi_{mm}(r)$, and if the biasing is "local" in the
sense that the galaxy density is a function only of the mass density
smoothed on small scales, it can be shown that $b(r)$ must approach a
constant on large scales (e.g. Coles 1993, Pen 1998). A detailed account
of the statistics of biasing, including possible non-linearities and
stochastic variations, and the relationships between the several
possible definitions of $b$, is given by Dekel & Lahav (1998). Recently,
there have been a number of predictions of bias assuming that galaxies
correspond to dark matter halos, e.g. the analytic model of Mo & White
(1996), which is compared with N-body simulations by e.g. Jing (1998)
and Kravtsov & Klypin (1998). The bias factor assuming various simple
prescriptions for the morphology-density relation is investigated by
Narayanan et al. (1998).

Surveys of galaxy peculiar motions can directly map the mass
distribution in the local universe, but the error bars per galaxy are
large and so the method is limited to rather local volumes with heavy
smoothing. Since it is difficult to measure the mass distribution
directly, another useful probe is to compare the clustering properties
of galaxy surveys with different selection criteria: if these cluster
differently, at least one class cannot exactly follow the mass
distribution. However, if both classes obey the linear bias relation
with different values of $b$, there is clearly a linear relation between
the two galaxy density fields and thus the cross-correlation function
$\xi_{12}(r) = \sqrt{\xi_1(r)\,\xi_2(r)}$, where $\xi_1$, $\xi_2$ are the
usual autocorrelation functions for the two galaxy classes. Thus, if
this relation does not hold, then the linear-bias model must fail for at
least one of the galaxy classes.

It is well known (e.g. Dressler 1980; Guzzo et al. 1997) that the
fraction of elliptical galaxies increases with local galaxy density, and
it is also known that IRAS galaxies (selected at 60 μm) have a lower
correlation amplitude on small scales than optically selected galaxies
(e.g. Saunders et al. 1992; Loveday et al. 1995). The density fields of
optical and IRAS galaxies are also known to be in quite good qualitative
agreement, i.e. the prominent nearby structures such as Virgo,
Perseus-Pisces and Hydra-Centaurus are common to both types of survey,
but quantitative comparisons so far have been rather limited in volume,
since the differing depths and sky coverages cause limited overlap
between the different surveys (e.g. Oliver et al. 1996). Probably for
this reason, there has been considerable scatter in previous estimates
of the relative bias $b_O/b_I$ (e.g. Lahav, Nemiroff & Piran 1992). A
comparison of velocity fields predicted from the IRAS 1.2 Jy and ORS
surveys is given by Baker et al. (1998), who find very good agreement
between the two velocity fields for a relative bias factor
$b_{\rm rel} = b_O/b_I \approx 1.4$. In this paper we provide a
comparison of density fields over a larger volume of space than
previously available, using the newly completed IRAS PSCz redshift
survey and the sparse-sampled Stromlo-APM optical survey.

Figure 1: An Aitoff projection of the PSCz and Stromlo-APM regions in
equatorial coordinates. The PSCz mask is indicated by the dark shading.
The light shaded area in the southern hemisphere is the Stromlo survey
region. An example of one of the shells of cells is also shown.

The plan of the paper is as follows: we summarise details of the two
surveys in §2, and then compare the clustering with three different
statistical methods: a counts-in-cells comparison in §3, two-point
correlation functions in §4 and the 'modified $\chi^2$' statistic of
Tegmark (1998) in §5. We summarise the conclusions in §6.

2 THE PSCZ AND STROMLO-APM REDSHIFT SURVEYS

The PSCz survey is described in detail by Saunders et al. (1997). The
aim of the survey was to obtain redshifts for all IRAS galaxies with
60 μm fluxes greater than 0.6 Jy over as much of the sky as possible.
The parent catalogue is based on the QMW IRAS Galaxy Catalogue
(Rowan-Robinson et al. 1990) but with a number of modifications designed
to increase sky coverage and improve completeness. The final 2-D
catalogue contains 17060 objects, 1593 of which were rejected as sources
in our Galaxy or multiple entries from resolved nearby galaxies. Another
648 were rejected as faint galaxies or as having no optical
identification, leaving 14819 in the target list. Redshifts are now
known for 14539 (95%) of these. The PSCz mask is shown in Figure 1 as
the dark shaded region. It includes regions of low galactic latitude,
some odd patches of galactic cirrus and the two strips of ecliptic
longitude not surveyed by IRAS.

The Stromlo-APM survey (Loveday et al. 1992) is an optically selected
redshift survey in a region of the southern hemisphere approximately
$21^{\rm h} \lesssim \alpha \lesssim 5^{\rm h}$,
$-72.5° \lesssim \delta \lesssim -17.5°$. The survey consists of 1787
galaxies of magnitude $b_J < 17.15$ selected at a rate of 1 in 20 from
the APM galaxy catalogue. The Stromlo-APM region is shown by the light
shading in Figure 1. The small holes in the survey are mainly due to
bright foreground stars. Figure 2 shows 'cone plots' of galaxies in
declination slices for the two catalogues in the APM region. Note that
the selection functions are somewhat different, with $dN/dz$ peaking at
$\sim 70\,h^{-1}$ Mpc for PSCz and $\sim 150\,h^{-1}$ Mpc for Stromlo,
and the shot noise is appreciable in both surveys due to the relatively
low space density of IRAS galaxies and the 1-in-20 sampling for Stromlo.
However, there is quite good qualitative similarity between the two
distributions, e.g. the prominent 'wall' near $\alpha \sim 21^{\rm h}$
and the squarish void centred on $\delta \sim -40°$,
$cz \sim 7000$ km s$^{-1}$.

In the following analysis we treat the two galaxy 
surveys as independent samples of the same region of 
the universe. This assumption would not be valid if 
there were an appreciable number of galaxies common 
to both catalogues. We have found 41 such galaxies and 
these have not been included in any of the subsequent 
analysis. 



2.1 Mock PSCz and Stromlo-APM catalogues 

Throughout this paper we used a suite of mock PSCz 
and Stromlo-APM surveys to estimate the uncertainty 
in the various statistics calculated. There are 27 pairs 
of mock catalogues in total, taken from N-body simulations
of standard $\Omega_0 = 1$ CDM universes. They have the
same survey geometry, selection function and sampling 
rate as their real counterparts. Each pair of Stromlo and 
PSCz catalogues was sampled from a common region of 
space in the same initial simulation so that underlying 
density fluctuations in the PSCz mock catalogue are 
mirrored in the corresponding Stromlo catalogue. We 
would therefore expect to see the clustering in each cat- 
alogue of the pair to differ only due to the shot noise. 
This property is useful when calculating the uncertainty 
in the cross correlation function and the ratio between 
the two individual correlation functions. Mass points in 
the simulations are interpreted as galaxies so the cata- 
logues do not include bias. 



3 COUNTS IN CELLS COMPARISON 

The counts-in-cells comparison method is discussed in detail in E95. The
surveyed region of space is divided into spherical shells of thickness
$\ell$ centred on the observer. Each shell is then subdivided into $N_c$
approximately cubic cells, each of volume $V = \ell^3$. For one
catalogue, we represent the galaxy counts in the $i$th cell by $N_i$.
The expected mean cell count is denoted by
$\langle N_i \rangle = \lambda$. The variance of the counts in cells in
excess of the Poisson variance is given by

$$ S_1 = \frac{1}{N_c - 1} \sum_i (N_i - \bar{N})^2 - \bar{N}, \qquad (1) $$

where $\bar{N}$ is the mean cell count. The expectation value of $S_1$
is

$$ \langle S_1 \rangle = \lambda^2 \sigma_1^2, \qquad (2) $$

where $\sigma_1^2$ is the variance of the underlying density field on
the scale of $\ell^3$. This is equal to a volume integral of the
autocorrelation function, $\xi(r_{12})$, over the cell of size $\ell$:

$$ \sigma_1^2 = \frac{1}{V^2} \int\!\!\int_{V=\ell^3} \xi(r_{12})\, dV_1\, dV_2. \qquad (3) $$

When the underlying density fluctuations are Gaussian the variance of
$S_1$ is given by

$$ {\rm Var}(S_1) = \frac{1}{N_c}\left[ 2\lambda^2 (1 + \sigma_1^2) + 4\lambda^3 \sigma_1^2 + 2\lambda^4 \sigma_1^4 \right]. \qquad (4) $$
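Equations (1) and (4) can be sketched in a few lines of Python. This is only an illustration: the function names and the sample counts are hypothetical, and the sample mean is used in place of $\lambda$ when estimating $\sigma_1^2$.

```python
from statistics import mean

def excess_variance(counts):
    """S of equation (1): sample variance of the cell counts minus the mean count."""
    n_cells = len(counts)
    n_bar = mean(counts)
    scatter = sum((n - n_bar) ** 2 for n in counts) / (n_cells - 1)
    return scatter - n_bar

def gaussian_variance_of_s(lam, sigma2, n_cells):
    """Var(S) of equation (4), valid for Gaussian density fluctuations."""
    bracket = (2 * lam**2 * (1 + sigma2)
               + 4 * lam**3 * sigma2
               + 2 * lam**4 * sigma2**2)
    return bracket / n_cells

# Hypothetical counts in six cells of one shell (illustrative values only).
counts = [3, 7, 2, 9, 4, 5]
s1 = excess_variance(counts)
# Equation (2) rearranged, with the sample mean standing in for lambda.
sigma2_hat = s1 / mean(counts) ** 2
```

In the pure Poisson limit ($\sigma_1^2 = 0$) the bracket in equation (4) reduces to $2\lambda^2$, which is a quick sanity check on any implementation.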

The statistics $S_1$ and $S_2$ can be found for the two redshift surveys
(the subscripts now refer to the two individual surveys) and used to
compare their clustering. If the two surveys occupy a common region of
space then the comparison can be taken a stage further by computing the
covariance, $S_{12}$, of the cell counts:

$$ S_{12} = \frac{1}{N_c - 1} \sum_i (N_i - \bar{N})(M_i - \bar{M}), \qquad (5) $$

where $N_i$ and $M_i$ now represent the counts from surveys 1 and 2
respectively. The underlying covariance $\sigma_{12}^2$ is defined
equivalently to $\sigma_1^2$, so that

$$ \langle S_{12} \rangle = \lambda\mu\,\sigma_{12}^2, \quad {\rm where} \quad \sigma_{12}^2 = \frac{1}{V^2} \int\!\!\int_{V=\ell^3} \xi_{12}(r_{12})\, dV_1\, dV_2, \qquad (6) $$

where $\mu = \langle M_i \rangle$ is the expected mean cell count from
survey 2, and $\xi_{12}(r)$ is the usual two-point cross-correlation
function.

To calculate the uncertainty in the statistics $S_1$, $S_2$ and $S_{12}$
we must first assume a model (e.g. Gaussian) for the clustering. The
need for such a model can be avoided if we consider only the differences
between the counts from surveys 1 and 2 in each cell. The normalised
mean square difference between the cell counts is

$$ S_d = \frac{1}{N_c - 1} \sum_i (\bar{M} N_i - \bar{N} M_i)^2 - (\bar{M} + \bar{N})\,\bar{M}\bar{N}. \qquad (7) $$

If the galaxy distributions in the two catalogues are Poisson point
processes with identical statistical properties then the statistic $S_d$
will be solely determined by Poisson statistics and its variance will be
given by

$$ {\rm Var}(S_d) \approx \lambda\mu\, \frac{N_c}{(N_c - 1)^2} \left[ \lambda^3 + \mu^3 + 4\lambda^2\mu^2 + 2(\lambda^3\mu + \lambda\mu^3) \right]. \qquad (8) $$

This value is independent of the underlying nature of the galaxy density
field. The statistic $S_d$ is related to $S_1$, $S_2$ and $S_{12}$ by

$$ S_d = \bar{M}^2 S_1 + \bar{N}^2 S_2 - 2\bar{M}\bar{N} S_{12}, \qquad (9) $$

and its expectation value is therefore

$$ \langle S_d \rangle = \lambda^2\mu^2 \left( \sigma_1^2 + \sigma_2^2 - 2\sigma_{12}^2 \right). \qquad (10) $$

This provides a simple way of determining whether the clustering
properties of the two surveys are consistent while avoiding making any
special assumptions about the nature of the clustering. If the two
catalogues sample the same density field then the three correlation
functions $\xi_1$, $\xi_2$ and $\xi_{12}$ will be identical, and we see
from equations (3) and (6) that so too are $\sigma_1^2$, $\sigma_2^2$
and $\sigma_{12}^2$. The expectation value of $S_d$ given in equation
(10) would therefore be zero, within the limits of the sampling error
given by equation (8).

Figure 2: The locations of galaxies in redshift space in slices of
declination within the APM region (panels include
$-50° < \delta < -40°$ and $-40° < \delta < -30°$). PSCz galaxies are
represented by black spots and Stromlo-APM galaxies by stars.
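The relation between the four cell statistics can be checked numerically. The sketch below computes $S_1$, $S_2$, $S_{12}$ and $S_d$ directly from their definitions for two hypothetical sets of paired cell counts; the function name and the counts are illustrative only.

```python
from statistics import mean

def cell_statistics(n_counts, m_counts):
    """Return (S1, S2, S12, Sd) as defined in equations (1), (5) and (7)."""
    nc = len(n_counts)
    n_bar, m_bar = mean(n_counts), mean(m_counts)
    pairs = list(zip(n_counts, m_counts))
    s1 = sum((n - n_bar) ** 2 for n in n_counts) / (nc - 1) - n_bar
    s2 = sum((m - m_bar) ** 2 for m in m_counts) / (nc - 1) - m_bar
    s12 = sum((n - n_bar) * (m - m_bar) for n, m in pairs) / (nc - 1)
    sd = (sum((m_bar * n - n_bar * m) ** 2 for n, m in pairs) / (nc - 1)
          - (m_bar + n_bar) * m_bar * n_bar)
    return s1, s2, s12, sd

# Hypothetical paired counts for survey 1 (N) and survey 2 (M).
n_counts = [3, 7, 2, 9, 4, 5]
m_counts = [1, 4, 2, 5, 3, 3]
s1, s2, s12, sd = cell_statistics(n_counts, m_counts)

# Right-hand side of equation (9), built from the other three statistics.
n_bar, m_bar = mean(n_counts), mean(m_counts)
sd_from_identity = m_bar**2 * s1 + n_bar**2 * s2 - 2 * m_bar * n_bar * s12
```

The directly computed `sd` and `sd_from_identity` agree to rounding error, which is a useful regression test when the cell-counting code is modified.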

We also investigate the possibility that the galaxies in the two
catalogues are biased tracers of the same underlying density field. In
particular we look at the linear bias model
$(\delta\rho/\rho)_{\rm gal} = b\,(\delta\rho/\rho)_m$, where the two
overdensity fields are perfectly correlated. If $b_1$ and $b_2$ are the
bias parameters of the two catalogues with respect to the underlying
matter distribution then the linear bias model implies

$$ \sigma_1^2 = b_1^2 \sigma_m^2, \qquad \sigma_2^2 = b_2^2 \sigma_m^2, \qquad \sigma_{12}^2 = b_1 b_2\, \sigma_m^2, \qquad (11) $$

where $\sigma_m^2$ is the variance of the underlying density field.
Hence we can estimate the relative bias $b_1/b_2$. The above clearly
implies

$$ \sigma_{12}^2 = \sigma_1 \sigma_2, \qquad (12) $$

which can be used to test for consistency with linear bias.
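A minimal sketch of how equations (11) and (12) translate into a relative bias and a consistency check is given below. The bias values and matter variance are hypothetical inputs chosen for illustration, not measurements from either survey.

```python
import math

def relative_bias(sigma_a_sq, sigma_b_sq):
    """b_a / b_b implied by equation (11): square root of the variance ratio."""
    return math.sqrt(sigma_a_sq / sigma_b_sq)

def correlation_coefficient(sigma12_sq, sigma1_sq, sigma2_sq):
    """sigma_12^2 / (sigma_1 sigma_2); equals 1 if equation (12) holds."""
    return sigma12_sq / math.sqrt(sigma1_sq * sigma2_sq)

# Hypothetical linear-bias inputs (illustrative values only).
b1, b2, sigma_m_sq = 1.0, 1.3, 0.2
sigma1_sq = b1**2 * sigma_m_sq
sigma2_sq = b2**2 * sigma_m_sq
sigma12_sq = b1 * b2 * sigma_m_sq

ratio = relative_bias(sigma2_sq, sigma1_sq)
coeff = correlation_coefficient(sigma12_sq, sigma1_sq, sigma2_sq)
```

For inputs built from equation (11) the coefficient comes out as unity by construction; any measured value significantly below 1 would indicate a departure from linear bias.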

The above equations for the $S_i$ apply to a single spherical shell with
constant mean densities $\lambda$, $\mu$. For the full surveys, we now
consider a series of $N_{\rm shell}$ spherical shells, each of thickness
$\ell$, centred on the observer, subdivided into cubical cells as
before. The statistics $S_1^k$, $S_2^k$ and $S_d^k$ are now computed
separately for each shell $k$.

When the number of cells, $N_c$, is large, the central limit theorem
states that the probability distribution of $S_1^k$, $S_2^k$ and $S_d^k$
will tend to a multivariate Gaussian,

$$ P(S_1^k, S_2^k, S_d^k)\, dS_1^k\, dS_2^k\, dS_d^k = \frac{\exp(A)}{(2\pi)^{3/2} (\det V)^{1/2}}\, dS_1^k\, dS_2^k\, dS_d^k, \qquad (13) $$

where

$$ A = -\frac{1}{2} \sum_{i,j} \left( S_i^k - \langle S_i^k \rangle \right) V_{ij}^{-1} \left( S_j^k - \langle S_j^k \rangle \right), \qquad (14) $$

and $V_{ij}$ is the covariance matrix
$V_{ij} = {\rm Cov}(S_i^k, S_j^k)$ ($i, j = 1, 2, d$). Expressions for
the elements of this matrix can be found in the appendix of E95. We form
the combined likelihood over all shells,

$$ \mathcal{L} \propto \prod_{k=1}^{N_{\rm shell}} P(S_1^k, S_2^k, S_d^k), \qquad (15) $$

which we maximise with respect to $\sigma_1^2$, $\sigma_2^2$ and
$\sigma_{12}^2$. This method automatically assigns a weight to each
shell. Nearby shells contribute little weight because the number of
cells is small, and distant shells contribute little information because
the statistics become dominated by shot noise.

We use lines of constant right ascension and declination as the cell
boundaries to match the geometry of the Stromlo-APM survey region. An
example of one of the concentric shells of cells is shown in Figure 1.
Each cell at right ascension $\alpha$ and declination $\delta$ subtends
an angle $\Delta\delta$ in declination and has a right ascension range
$\Delta\alpha$ such that $\Delta\alpha = \Delta\delta/\cos\delta$. The
angle $\Delta\delta$ is chosen at each radial distance to ensure that
the sides of the cell are of length $\ell$, the shell separation.
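The cell geometry can be illustrated with a short calculation. The sketch below assumes the small-angle relation $\Delta\delta \approx \ell/r$ (in radians) for a cell of side $\ell$ at radial distance $r$; that relation is an assumption made here for illustration, and the function name is hypothetical.

```python
import math

def cell_angles(r, ell, dec_deg):
    """Angular extent (degrees) of a cell of side ell at distance r.

    Assumes the small-angle relation delta_dec = ell / r (radians), then
    widens the right-ascension range by 1/cos(dec) so that the cell keeps
    the same physical width away from the equator.
    """
    d_dec = math.degrees(ell / r)
    d_ra = d_dec / math.cos(math.radians(dec_deg))
    return d_dec, d_ra

# A 20 h^-1 Mpc cell at 100 h^-1 Mpc, on the equator and at dec = -60 deg.
d_dec_eq, d_ra_eq = cell_angles(100.0, 20.0, 0.0)
d_dec_south, d_ra_south = cell_angles(100.0, 20.0, -60.0)
```

At declination $-60°$ the right-ascension range is twice the declination range, since $\cos 60° = 0.5$.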

We apply a joint PSCz-Stromlo mask to both surveys so that Stromlo
galaxies within the PSCz mask are not included and vice versa. A number
of the cells are partially or totally covered by the mask. We correct
for this by replacing equation (1) with the expression

$$ S_1^k = A/B, \quad {\rm where} \qquad (16) $$

$$ A = \sum_j \left( N_j - V_j \frac{\sum_k N_k}{\sum_k V_k} \right)^2 - \left[ 1 - \frac{\sum_k V_k^2}{(\sum_k V_k)^2} \right] \sum_j N_j, $$

$$ B = \sum_k V_k^2 - \frac{2 \sum_k V_k^3}{\sum_k V_k} + \frac{(\sum_k V_k^2)^2}{(\sum_k V_k)^2} $$

(Efstathiou et al. 1990), where the sums extend over all cells in the
radial shell and $V_j$ is the volume of cell $j$ not excluded by the
mask. This expression is only valid when a small fraction of each cell
is excluded by the mask. Cells that are more than 30% covered by the
joint mask were therefore not used in the analysis. To calculate the
statistic $S_d$ and the mean cell counts $\bar{N}$ and $\bar{M}$ we
artificially increase the galaxy counts in partially filled cells to
account for the fraction excluded by the mask.
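A volume-weighted estimator of this kind can be sketched as follows. The form coded here is an assumption: it subtracts the shot-noise term from the scatter of the counts about their volume-weighted expectation, with volumes expressed as fractions of a full cell, and it is constructed so that it reduces to equation (1) when no cell is masked. The function name and the sample values are hypothetical.

```python
def masked_excess_variance(counts, volumes):
    """Volume-weighted excess variance S = A / B for partially masked cells.

    counts  : galaxy counts N_j in each cell of a shell
    volumes : unmasked volume V_j of each cell, as a fraction of a full cell
    """
    tot_n = sum(counts)
    tot_v = sum(volumes)
    v2 = sum(v**2 for v in volumes)
    v3 = sum(v**3 for v in volumes)
    # Scatter about the volume-weighted expected count, minus shot noise.
    a = (sum((n - v * tot_n / tot_v) ** 2 for n, v in zip(counts, volumes))
         - (1.0 - v2 / tot_v**2) * tot_n)
    b = v2 - 2.0 * v3 / tot_v + v2**2 / tot_v**2
    return a / b

# With every cell fully unmasked the result matches equation (1).
counts = [3, 7, 2, 9, 4, 5]
s_unmasked = masked_excess_variance(counts, [1.0] * 6)
# Same counts with two cells 20 per cent masked (illustrative values only).
s_masked = masked_excess_variance(counts, [1.0, 0.8, 1.0, 1.0, 0.8, 1.0])
```

For equal volumes the denominator becomes $N_c - 1$ and the numerator becomes $\sum_i (N_i - \bar{N})^2 - (N_c - 1)\bar{N}$, which recovers the unmasked statistic exactly.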

If we wish to use equation (13) to calculate the likelihood then there
must be enough cells in a given radial shell for the central limit
theorem to be applicable. Any shells that do not contain enough cells
for this to be the case must be excluded from the analysis. We have only
included shells with 20 or more filled (< 30% masked) cells. The nearest
shells are therefore rejected. Cells beyond a radial distance of
$250\,h^{-1}$ Mpc are not included in the analysis as they contain very
few galaxies and so make a negligible contribution to the final result.
The PSCz survey is only complete out to this distance for $|b| > 10°$.
We therefore mask this region from the PSCz (this has no effect on the
'combined' mask since the Stromlo survey is at $b \lesssim -40°$).

6 M.D. Seaborne et al.

Figure 3: $\sigma^2$ as a function of cell size $\ell$ (in $h^{-1}$ Mpc)
for the PSCz (squares) and Stromlo (stars) surveys. The error bars are 1
standard deviation.

Before doing the cell-by-cell comparison described above we first
measure the values of the cell count variances calculated using the
whole of each survey; these are shown as a function of cell size in
Figure 3. We attempt to minimise the loss of information that arises
from binning the data into cells by shifting the entire grid of cells
and recalculating the likelihood. We shift the grid by either 0 or
$\ell/2$ from the 'default' value in combinations of right ascension,
declination and radial velocity, thus giving eight separate (although
not statistically independent) estimates of the likelihood.

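For each of the 8 grid positions we form the individual one-dimensional
likelihood

$$ \mathcal{L}_1 = \prod_{k=1}^{N_{\rm shell}} \frac{1}{\sqrt{2\pi\, {\rm Var}(S_1^k)}} \exp\left[ -\frac{(S_1^k - \bar{N}^2 \sigma_1^2)^2}{2\, {\rm Var}(S_1^k)} \right]. \qquad (17) $$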
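The one-dimensional likelihood of equation (17) can be sketched as a sum of Gaussian log-terms over shells, maximised on a grid of trial variances. As a simplification the per-shell variance is treated here as a fixed input, whereas in the analysis it depends on the trial variance through equation (4); the function names and shell values are hypothetical.

```python
import math

def log_likelihood(sigma2, shells):
    """ln L of equation (17); shells is a list of (S_k, Nbar_k, Var_k) tuples."""
    total = 0.0
    for s_k, nbar_k, var_k in shells:
        resid = s_k - nbar_k**2 * sigma2
        total += -0.5 * math.log(2.0 * math.pi * var_k) - resid**2 / (2.0 * var_k)
    return total

def maximise(shells, grid):
    """Return the trial variance on the grid with the highest likelihood."""
    return max(grid, key=lambda s2: log_likelihood(s2, shells))

# Hypothetical shells built to have a true variance of 0.3 and no noise.
true_sigma2 = 0.3
shells = [(nbar**2 * true_sigma2, nbar, var)
          for nbar, var in [(8.0, 4.0), (5.0, 2.5), (2.0, 1.0)]]
grid = [i / 100 for i in range(1, 101)]
best = maximise(shells, grid)
```

Shells with small `Var_k` dominate the sum, which is how the likelihood weights each shell automatically.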



We obtain the final estimates of $\sigma_1^2$, $\sigma_2^2$ by computing
the product of the eight likelihood functions and finding the maximum;
this should provide a better estimate of the peak than using a single
grid position alone. However, it is not possible to use the combined
likelihood to estimate the uncertainty in the values, as the results
from each grid position are not independent. We generate the error bars
in Figure 3 by repeating the analysis on the 27 mock PSCz and Stromlo
catalogues, and measuring the standard deviation of the $\sigma^2$
values estimated as above. From the simulations, combining the results
from the eight grid positions gives a 20-40% gain in precision in the
values of $\log \sigma^2$ over the use of a single set of cells. The
gain in precision in general increases with cell size. The observed sum
of the eight $\log \mathcal{L}$'s is then rescaled so that the standard
deviation of $\log \sigma^2$ matches that from the simulations.

Figure 4: A comparison of the counts in $20\,h^{-1}$ Mpc cells for PSCz
and Stromlo galaxies. The top panel shows the values of $\sigma^2$ for
the two catalogues as a function of radial distance. These are obtained
directly from the statistics $S_1$ and $S_2$. The dashed lines show the
respective maximum likelihood values calculated with the assumption
$\sigma_{12}^2 = \sigma_1 \sigma_2$. The lower panel shows the statistic
$S_d$. The maximum likelihood value is calculated with the same
assumption and using equation (10). All error bars show one standard
deviation.

As expected, the variances decrease with increasing cell size as the
galaxy distribution approaches homogeneity on larger scales, and the
Stromlo variances are consistently higher than those of PSCz. For cells
of sizes 20, 25 and $40\,h^{-1}$ Mpc the error bars do not overlap,
suggesting that the differences in clustering amplitude are highly
significant. Only at $30\,h^{-1}$ Mpc are the values comparable.

Figure 4 shows the values of $\sigma_1^2$, $\sigma_2^2$ and the
statistic $S_d$ for each radial shell, for a cell size of
$20\,h^{-1}$ Mpc, using Stromlo and PSCz galaxies within the APM region
only.

Again we see that the Stromlo variances are, in general, higher. The
maximum likelihood value of the statistic $S_d$ lies slightly above zero
and has a reduced $\chi^2$ of 1.3 (when compared to the value zero),
suggesting a slight difference in the clustering of galaxies in the two
catalogues.



3.1 Linear bias 

We next consider whether these clustering differences are in agreement
with a linear relative bias
$\delta_{\rm optical} = b_{\rm rel}\,\delta_{\rm IRAS}$. The likelihood
function (equation (15)) is a function of the three parameters
$\sigma_1^2$, $\sigma_2^2$ and $\sigma_{12}^2$; it is more convenient to
change the third parameter to the correlation coefficient
$R = \sigma_{12}^2/(\sigma_1 \sigma_2)$, since we must have
$-1 \le R \le 1$ in order that the covariance matrix in eq. (14) is
positive definite. Linear bias clearly implies $R = 1$. We then assume a
uniform prior in the space of
$(\log \sigma_1^2, \log \sigma_2^2, R)$ and integrate $\mathcal{L}$ over
the $\log \sigma^2$'s to obtain a one-dimensional likelihood function
for $R$; this is shown in Figure 5 for cell sizes in the range
10-40 $h^{-1}$ Mpc. Again, each value of the likelihood $\mathcal{L}(R)$
was calculated using eight different cell grid positions, and the values
given are the geometric mean of these eight values. No attempt has been
made to reduce the width of the curves to account for the gain in
precision obtained from the use of multiple grid positions.

Figure 5: The likelihood as a function of the correlation coefficient
$R$.

For cells of size 25 and $40\,h^{-1}$ Mpc the maximum likelihood value
of $R$ occurs at $R = 1$, consistent with linear bias. Values of $R$
greater than unity are of course unphysical, since this would imply that
the correlation between the galaxy counts is better than perfect.
However, it is not unreasonable for the derivative of the likelihood
function to be positive at $R = 1$, if the Poisson differences in the
real cell counts happen to be smaller than the 'ensemble average' value.
For cells of size 10, 15, 20 and $40\,h^{-1}$ Mpc we find that
$R_{\rm max} = 0.96$, 0.89, 0.93 and 0.93 respectively, though in each
case $R = 1$ is well within one standard deviation of this. We conclude
that the galaxy cell counts are consistent with a linear bias model, and
we derive 95% confidence lower limits of $R > 0.83$, 0.72, 0.55 for
cells of size 10, 20, 30 $h^{-1}$ Mpc respectively.

Figure 6 shows contour maps of the joint likelihood assuming a linear
bias, $\mathcal{L}(\sigma_1^2, \sigma_2^2, R = 1)$, for cells of size
10-40 $h^{-1}$ Mpc. Again we repeat the analysis using the eight
different grid positions, and compute the geometric mean of the
individual $\mathcal{L}$'s. This improves our estimate of the peak
location, but the contours do not then accurately reflect our improved
knowledge of the two variances. Thus, we compute the value of $\sigma^2$
for the 27 pairs of mock catalogues, this time using only galaxies that
lie outside the joint mask. We finally rescale the mean
$\log \mathcal{L}$ so that the width of the function matches that in the
simulations.

We approximate the shape of the likelihood function by that produced by
two Gaussian distributions, such that the value of
$-2\ln(\mathcal{L}/\mathcal{L}_{\rm max})$ has a chi-squared
distribution with two degrees of freedom. This approximation was tested
by directly determining the value of the contour that contains 95% of
the likelihood enclosed within a large area of the
$\sigma_1^2$-$\sigma_2^2$ ($R = 1$) plane. For each cell size examined
the 95% contour (shown in bold in Figure 6) accurately coincided with
the contour $-2\ln(\mathcal{L}/\mathcal{L}_{\rm max}) = 5.99$, which is
the upper 5 per cent point (95th percentile) of the $\chi^2$
distribution with two degrees of freedom.
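The contour levels used here follow from the closed form of the chi-squared distribution with two degrees of freedom, whose cumulative distribution is $1 - e^{-x/2}$. A minimal sketch (the function name is hypothetical):

```python
import math

def chi2_2dof_threshold(confidence):
    """Value x such that a chi-squared variable with 2 degrees of freedom
    falls below x with probability `confidence` (CDF = 1 - exp(-x/2))."""
    return -2.0 * math.log(1.0 - confidence)

levels = {c: chi2_2dof_threshold(c) for c in (0.68, 0.95, 0.99, 0.999)}
```

Inverting the cumulative distribution in this way avoids any dependence on a statistics library, but it is specific to the two-parameter case.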

For cells of size 15-30 $h^{-1}$ Mpc we rule out the hypothesis of equal
bias, as the line $\sigma_1^2 = \sigma_2^2$ lies well outside the 95%
confidence interval. The variances of the counts in cells for
Stromlo-APM are clearly larger than those for PSCz. For cells of size 10
and $40\,h^{-1}$ Mpc a unit bias is marginally consistent with the
larger errors, but a value > 1 is still favoured. All the cell sizes are
consistent with a scale-independent relative bias of
$b_{\rm rel} = b_{\rm Stromlo}/b_{\rm PSCz} \approx 1.4$.

Table 1 shows the relative bias for each cell size. The value
$b_{\rm rel}^2$ is given by the ratio of the maximum likelihood
variances $\sigma_{\rm Stromlo}^2/\sigma_{\rm PSCz}^2$. The 95%
confidence interval was found by determining the two straight lines
$\sigma_{\rm Stromlo}^2 = b_{\rm min}^2\, \sigma_{\rm PSCz}^2$ and
$\sigma_{\rm Stromlo}^2 = b_{\rm max}^2\, \sigma_{\rm PSCz}^2$ on the
contour plots between which 95% of the total likelihood lies.



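4 THE TWO-POINT AND CROSS-CORRELATION FUNCTIONS

The linear bias model predicts that the individual two-point correlation
functions, $\xi_1$ and $\xi_2$, will be related to the cross-correlation
function, $\xi_{12}$, by

$$ \xi_{12} = (\xi_1 \xi_2)^{1/2}. \qquad (18) $$

The three correlation functions were found for the galaxies within the
common survey region using the estimators

$$ \xi_{ii}(r) = \frac{\langle D_i D_i \rangle \langle R_i R_i \rangle}{\langle D_i R_i \rangle^2} - 1, \qquad \xi_{12}(r) = \frac{\langle D_1 D_2 \rangle \langle R_1 R_2 \rangle}{\langle D_1 R_2 \rangle \langle D_2 R_1 \rangle} - 1 \qquad (19) $$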

(Hamilton 1993), where the notation $\langle DD \rangle$,
$\langle RR \rangle$ and $\langle DR \rangle$ refers to the (weighted)
number of data-data, random-random and data-random pairs respectively in
a narrow bin of separation $r$. The subscripts 1 and 2 again correspond
to the two surveys. The two random catalogues were generated within the
joint survey volume, each with the correct corresponding radial
selection function.

Figure 6: Contour maps of the likelihood of variances of PSCz and
Stromlo galaxy counts for a range of cell sizes. In each case we have
assumed $\sigma_{12}^2 = \sigma_1 \sigma_2$. The contours show where
$-2\ln(\mathcal{L}/\mathcal{L}_{\rm max})$ is equal to 2.28, 5.99, 9.21
and 13.81. These correspond to 68%, 95% (thick contour), 99% and 99.9%
confidence intervals respectively, according to a chi-squared
distribution with two degrees of freedom. The dotted line shows
$\sigma_1^2 = \sigma_2^2$.

Table 1: Relative bias as a function of cell size.

Figure 7: The Stromlo, PSCz and cross-correlation functions for the
overlapping volume. The $1\sigma$ error bars were generated using the 27
pairs of mock catalogues.

The selection functions were determined by integrating the published
luminosity functions for the two surveys. For Stromlo-APM we use a
Schechter function with input parameters $\alpha = -0.97$,
$M^* = -19.50$ and $\phi^* = 1.4 \times 10^{-2}\,h^3$ Mpc$^{-3}$
(Loveday et al. 1992). For the IRAS luminosity function we use the
parametric form



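$$ \phi(L) = C \left( \frac{L}{L_*} \right)^{1-\alpha} \exp\left[ -\frac{1}{2\sigma^2} \log_{10}^2 \left( 1 + \frac{L}{L_*} \right) \right], \qquad (20) $$

with $C = 2.6\,h^3$ Mpc$^{-3}$, $\alpha = 1.09$, $\sigma = 0.724$ and
$L_* = 10^{8.47}\,h^{-2} L_\odot$ (Saunders et al. 1990). Each random
catalogue contains $2 \times 10^4$ data points.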
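The estimators of equation (19) reduce to simple ratios once the pair counts in a separation bin are available. The sketch below assumes the pair counts have already been accumulated (and weighted) elsewhere; the function names and the numerical counts are hypothetical.

```python
def hamilton_auto(dd, rr, dr):
    """Auto-correlation estimate DD*RR/DR^2 - 1 (Hamilton 1993) for one bin."""
    return dd * rr / dr**2 - 1.0

def hamilton_cross(d1d2, r1r2, d1r2, d2r1):
    """Cross-correlation analogue: D1D2*R1R2 / (D1R2*D2R1) - 1 for one bin."""
    return d1d2 * r1r2 / (d1r2 * d2r1) - 1.0

# Hypothetical weighted pair counts in a single separation bin.
xi_auto = hamilton_auto(dd=2.0, rr=1.0, dr=1.0)
xi_cross = hamilton_cross(d1d2=1.5, r1r2=1.0, d1r2=1.0, d2r1=1.0)
```

Because each estimator is a ratio in which every catalogue appears equally often in numerator and denominator, an overall rescaling of any one catalogue's weights cancels.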

To each galaxy or random point in a pair we assign the weight

$$ w_i = \frac{1}{1 + 4\pi n(r_i) J_3(r)}, \qquad J_3(r) = \int_0^r \xi(x)\, x^2\, dx, \qquad (21) $$

where $r_i$ is the radial distance to the galaxy. This weighting scheme
can be shown to give the minimum uncertainty in $\xi(r)$ on scales where
$\xi(r) \ll 1$. We determine the input quantity $J_3$ using
determinations of the IRAS and Stromlo correlation functions by previous
authors. We use power-law fits of the form
$\xi(r) = (r/r_0)^{-\gamma}$. Fisher et al. (1994) have measured the
correlation function of the IRAS 1.2-Jy survey and find that their
results are well described by a power law with
$r_0 = 4.53\,h^{-1}$ Mpc and $\gamma = 1.28$. The Stromlo-APM
correlation function has been determined by Loveday et al. (1995), who
obtain a power law with $r_0 = 5.9\,h^{-1}$ Mpc and $\gamma = 1.47$. The
$1\sigma$ error bars were calculated using the 27 pairs of mock PSCz and
Stromlo catalogues under the same weighting scheme.
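For a power-law correlation function the integral $J_3$ has a closed form, $J_3(r) = r_0^\gamma\, r^{3-\gamma}/(3-\gamma)$ for $\gamma < 3$, so the weights can be evaluated directly. In the sketch below the function names are hypothetical and the mean density passed to the weight function is an illustrative number, not a value from either survey.

```python
import math

def j3_power_law(r, r0, gamma):
    """J3(r) = integral from 0 to r of (x/r0)^-gamma * x^2 dx, for gamma < 3."""
    return r0**gamma * r ** (3.0 - gamma) / (3.0 - gamma)

def pair_weight(density, j3):
    """Minimum-variance weight 1 / (1 + 4 pi n J3) for a galaxy where the
    selection-weighted mean density is `density`."""
    return 1.0 / (1.0 + 4.0 * math.pi * density * j3)

# Stromlo-APM power-law parameters quoted in the text.
j3_stromlo = j3_power_law(r=5.9, r0=5.9, gamma=1.47)
# Illustrative mean density in h^3 Mpc^-3 (a made-up value).
w = pair_weight(density=1.0e-3, j3=j3_stromlo)
```

Where the density is high the weight falls below unity, so nearby, densely sampled regions do not dominate the pair counts; where the density tends to zero every galaxy receives unit weight.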

The three correlation functions are shown in Figure 7. The Stromlo
correlation function is in general greater than that of PSCz, consistent
with the cell count variances. As expected, the cross-correlation
function lies between the two individual functions.

Figure 8: The cross-correlation function, $\xi_{12}$, for PSCz and
Stromlo-APM and the quantity $(\xi_1 \xi_2)^{1/2}$, where $\xi_1$ and
$\xi_2$ are the individual two-point correlation functions. The error
bars show one standard deviation calculated from the mock catalogues.

The cross-correlation function $\xi_{12}$ and the quantity
$(\xi_1 \xi_2)^{1/2}$ are shown in Figure 8. This shows that these two
quantities are consistent within their error bars, i.e. in good
agreement with the linear biasing hypothesis. If the galaxy
distributions were related in some other way, for example by non-linear
or stochastic biasing, we would expect $\xi_{12}$ to lie below
$(\xi_1 \xi_2)^{1/2}$. We see from Figure 8 that there is no such
systematic difference. Whereas the counts-in-cells technique was only
used to investigate structure on scales of $\ell \ge 10\,h^{-1}$ Mpc,
here we see that the linear bias model is consistent right down to
scales of $\sim 1.5\,h^{-1}$ Mpc. On scales smaller than this our
determination of the three correlation functions becomes highly
uncertain, as there are few galaxy pairs at these separations.

The relative bias $b_{\rm rel} = b_{\rm Stromlo}/b_{\rm PSCz}$ is
estimated from the correlation functions at distances in the range
1-20 $h^{-1}$ Mpc using

$$ b_{\rm rel} = \left( \frac{\xi_{\rm Stromlo}}{\xi_{\rm PSCz}} \right)^{1/2} \qquad (22) $$

and is plotted in Figure 9. Beyond separations of
$r \approx 30\,h^{-1}$ Mpc the relative bias becomes highly uncertain,
as both correlation functions are consistent with zero on these scales.
Where the $1\sigma$ error bars are not too large, we see that the
relative bias remains roughly constant over the separation range
$r < 20\,h^{-1}$ Mpc and has a mean value of
$b_{\rm rel} = 1.29 \pm 0.07$, where the error is estimated from the
mock catalogues. A calculation of
$\chi^2 = \sum_{i=1}^{N} (b_i - \bar{b})^2/\sigma_i^2$ gives a reduced
chi-squared of 1.76. Although a little high, this value does not rule
out the hypothesis that the relative bias is unchanging with scale
within a 95% confidence limit. (Note that caution should be taken in the
interpretation of this $\chi^2$ value, since the measurements of the
values of $b$ on different scales are not independent.) More
importantly, we note that there is no obvious trend in the relative bias
with scale.

Figure 9: The relative bias between the Stromlo and PSCz correlation
functions over the range 1-20 $h^{-1}$ Mpc. The dotted line is the mean
value $b_{\rm Stromlo}/b_{\rm PSCz} = 1.29$.

We estimate the average correlation coefficient $R = \xi_{12}/\sqrt{\xi_1 \xi_2}$ by taking an unweighted mean over data points in the range $2\,h^{-1}$ Mpc $< r < 20\,h^{-1}$ Mpc; the result is $R = 0.93 \pm 0.06$, where we again derive the $1\sigma$ error from the simulations. This can be used to place limits on "stochastic bias", as discussed later.
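As a rough illustration of the averaging step, the following Python sketch computes the unweighted mean of the per-bin correlation coefficient over a separation window. The function name, argument names, and sample values are hypothetical; only the formula and the $2 - 20\,h^{-1}$ Mpc window come from the text.

```python
import math

def mean_correlation_coefficient(r, xi_cross, xi_a, xi_b, r_min=2.0, r_max=20.0):
    """Unweighted mean of R = xi_cross / sqrt(xi_a * xi_b) over r_min < r < r_max.

    xi_a and xi_b are the two auto-correlation functions and xi_cross is the
    cross-correlation function, all sampled at the separations in r.
    """
    values = [x12 / math.sqrt(xa * xb)
              for ri, x12, xa, xb in zip(r, xi_cross, xi_a, xi_b)
              if r_min < ri < r_max]
    return sum(values) / len(values)

# Hypothetical binned correlation functions, for illustration only.
r = [1.0, 3.0, 6.0, 12.0, 30.0]          # separations in h^-1 Mpc
xi_a = [9.0, 2.0, 0.6, 0.15, 0.01]
xi_b = [14.0, 3.3, 1.0, 0.26, 0.02]
xi_cross = [10.5, 2.4, 0.72, 0.18, 0.01]

print(round(mean_correlation_coefficient(r, xi_cross, xi_a, xi_b), 3))
```

Bins outside the window are simply dropped, so noisy large-separation bins where the correlation functions approach zero do not enter the mean.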



5 THE TEGMARK 'NULL-BUSTER' TEST 

Tegmark and Bromley (1998) use a generalised $\chi^2$-statistic to directly compare cell counts of different galaxy populations in the Las Campanas Redshift Survey. We have applied their method to compare the clustering in PSCz and Stromlo-APM. We bin the galaxies from the two surveys using the cells described earlier and, for each of $N_c$ cells, we calculate the overdensities $g_i^{(\rm PSCz)} = N_i^{(\rm PSCz)}/\langle N\rangle_i^{(\rm PSCz)} - 1$ and $g_i^{(\rm Stromlo)} = N_i^{(\rm Stromlo)}/\langle N\rangle_i^{(\rm Stromlo)} - 1$ ($i = 1, \ldots, N_c$). The expected counts $\langle N\rangle_i^{(\rm PSCz)}$ and $\langle N\rangle_i^{(\rm Stromlo)}$ were calculated using the respective selection functions and the joint survey mask. If the galaxy densities are related by linear bias, then there will exist a value for $b_{\rm rel}$ such that the (column) vector

$$ \Delta\mathbf{g} = \mathbf{g}_{\rm Stromlo} - b_{\rm rel}\, \mathbf{g}_{\rm PSCz} \qquad (23) $$

is consistent with shot noise. The covariance matrix $\mathbf{N}$ of $\Delta\mathbf{g}$ is diagonal and (for density fluctuations $\lesssim 1$) its elements are given by

$$ N_{ii} = \frac{1}{\langle N\rangle_i^{(\rm Stromlo)}} + \frac{b_{\rm rel}^2}{\langle N\rangle_i^{(\rm PSCz)}}. \qquad (24) $$

Our null hypothesis is that, for a given value of $b_{\rm rel}$, $\langle\Delta\mathbf{g}\rangle = 0$ and $\langle\Delta\mathbf{g}\,\Delta\mathbf{g}^t\rangle = \mathbf{N}$. We might choose to test this hypothesis by calculating the statistic $\chi^2 = \Delta\mathbf{g}^t \mathbf{N}^{-1} \Delta\mathbf{g}$. The number of "sigmas" at which the null hypothesis is ruled out is then $\nu = (\chi^2 - N_c)/\sqrt{2N_c}$. If we have an alternative hypothesis that there is an extra signal with covariance matrix $\mathbf{S}$, such that $\langle\Delta\mathbf{g}\,\Delta\mathbf{g}^t\rangle = \mathbf{N} + \mathbf{S}$, then Tegmark (1998) shows that the statistic $\Delta\mathbf{g}^t \mathbf{N}^{-1} \mathbf{S} \mathbf{N}^{-1} \Delta\mathbf{g}$ will provide a more sensitive test (by using our prior knowledge of the signal covariance matrix). The significance at which the null hypothesis can be ruled out is then given by

$$ \nu = \frac{\Delta\mathbf{g}^t \mathbf{N}^{-1} \mathbf{S} \mathbf{N}^{-1} \Delta\mathbf{g} - {\rm tr}\,\mathbf{N}^{-1}\mathbf{S}}{\left[ 2\,{\rm tr}\{\mathbf{N}^{-1}\mathbf{S}\mathbf{N}^{-1}\mathbf{S}\} \right]^{1/2}}. \qquad (25) $$

If the biasing is non-linear then the deviations from linearity would be correlated with large-scale structure. We thus choose the matrix $\mathbf{S}$ to be the covariance between cell overdensities calculated using the PSCz correlation function, i.e. the volume-average of $\xi(r_{ij})$ over cells $i$ and $j$. (Note that $\nu$ depends only on the shape of $\mathbf{S}$, not its amplitude.) The resulting values of $\nu$ as a function of $b_{\rm rel}$ for a range of cell sizes are shown in Figure 10.

As expected, we see that extreme values of the relative bias parameter are ruled out with high significance (i.e., the clustering in each individual survey is inconsistent with pure shot noise). The position of the minimum indicates the most likely value of $b_{\rm rel}$; Table 2 shows this value for each cell size. The $1\sigma$ errors are estimated using the 27 simulated catalogues from § IA. For all cell sizes we see that the minimum value of $\nu$ is typically not much larger than 1, again suggesting that the two surveys are consistent with a linear relative bias. At all scales the relative bias is consistent with $b_{\rm rel} \approx 1.3$, again above unity but slightly lower than the values estimated in §3.
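The test described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used for the paper: the function names and the sample counts are hypothetical, the noise matrix is taken to be diagonal as in equation (24), and the signal matrix `S` must be supplied by the caller (the volume-averaged correlation function used in the text is not computed here).

```python
import math

def overdensity(counts, expected):
    """Cell overdensities g_i = N_i / <N>_i - 1."""
    return [n / e - 1.0 for n, e in zip(counts, expected)]

def null_buster(g_a, g_b, exp_a, exp_b, b_rel, S=None):
    """Number of 'sigmas' at which linear bias g_a = b_rel * g_b is rejected.

    With S=None this returns the plain form nu = (chi2 - N_c) / sqrt(2 N_c);
    otherwise it evaluates the generalised statistic of equation (25),
    where S is a square list-of-lists signal covariance matrix.
    """
    n_c = len(g_a)
    dg = [ga - b_rel * gb for ga, gb in zip(g_a, g_b)]              # equation (23)
    noise = [1.0 / ea + b_rel ** 2 / eb
             for ea, eb in zip(exp_a, exp_b)]                        # diagonal of N, equation (24)
    w = [d / n for d, n in zip(dg, noise)]                           # N^-1 dg
    if S is None:
        chi2 = sum(d * wi for d, wi in zip(dg, w))
        return (chi2 - n_c) / math.sqrt(2.0 * n_c)
    idx = range(n_c)
    quad = sum(w[i] * S[i][j] * w[j] for i in idx for j in idx)      # dg^t N^-1 S N^-1 dg
    tr_ns = sum(S[i][i] / noise[i] for i in idx)                     # tr(N^-1 S)
    tr_nsns = sum(S[i][j] * S[j][i] / (noise[i] * noise[j])
                  for i in idx for j in idx)                         # tr(N^-1 S N^-1 S)
    return (quad - tr_ns) / math.sqrt(2.0 * tr_nsns)

# Hypothetical cell counts and expected counts, for illustration only.
counts_s, exp_s = [12, 8, 15, 5], [10.0, 10.0, 10.0, 10.0]
counts_p, exp_p = [30, 22, 35, 18], [25.0, 25.0, 25.0, 25.0]
g_s, g_p = overdensity(counts_s, exp_s), overdensity(counts_p, exp_p)

# Scan b_rel and report the value that minimises nu.
grid = [0.5 + 0.05 * k for k in range(41)]
best = min(grid, key=lambda b: null_buster(g_s, g_p, exp_s, exp_p, b))
print(best, null_buster(g_s, g_p, exp_s, exp_p, best))
```

Because $\mathbf{N}$ is diagonal, the traces reduce to simple sums and no matrix inversion is needed. Rescaling `S` by a constant leaves the result unchanged, which matches the remark that $\nu$ depends only on the shape of $\mathbf{S}$.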



6 CONCLUSIONS 

We have compared the clustering of two redshift surveys (the new IRAS-selected PSCz survey, and the optically-selected Stromlo-APM survey) within their common region of space. Three complementary methods have been used: the counts-in-cells method, the two-point correlation function, and the Tegmark 'null-buster' test. In all three cases the results are consistent with a linear biasing model, i.e. $g_{\rm Stromlo} = b_{\rm rel}\, g_{\rm PSCz}$ with $b_{\rm rel} = b_{\rm Stromlo}/b_{\rm PSCz} \approx 1.3 \pm 0.1$. There is little evidence for variation of the relative bias over the range of scales from $\sim 5 - 30\,h^{-1}$ Mpc. We find a lower limit on the correlation coefficient $R \gtrsim 0.85$ on scales $\sim 10 - 20\,h^{-1}$ Mpc.

Figure 10: The function $\nu(b_{\rm rel})$, i.e. the number of "sigmas" at which the linear bias $g_{\rm Stromlo} = b_{\rm rel}\, g_{\rm PSCz}$ can be ruled out, as a function of $b_{\rm rel}$. The position of the minimum is given for each cell size.

Table 2: Relative bias as a function of cell size.

  Cell size ($h^{-1}$ Mpc)    $b_{\rm rel}$
  15                          1.27 ± 0.11
  20                          1.22 ± 0.09
  25                          1.27 ± 0.12
  30                          1.30 ± 0.16
  40                          1.65 ± 0.13

Our value for the relative bias is in quite good agreement with earlier estimates: Baker et al. (1998) found $b_{\rm rel} = 1.4$; Willmer et al. found $b_{\rm rel} = 1.20 \pm 0.07$; and Saunders et al. (1992) obtained a value of $b_{\rm rel} = 1.38 \pm 0.12$ in real space. Our high value for $R$ seems somewhat unexpected in view of the results of Tegmark & Bromley (1998), who found values of $R$ as low as 0.5 when comparing various pairs of spectral classes in the Las Campanas redshift survey. However, their results are not directly comparable with ours for several reasons: firstly, they used considerably smaller cells of size $\sim 6\,h^{-1}$ Mpc, and secondly our galaxy classes (PSCz and Stromlo) do not correspond closely to the classes used by Tegmark & Bromley; Stromlo contains a mix of early and late type galaxies, while PSCz is weighted towards late-type spirals with a preference for high surface brightnesses (hence higher dust temperatures) and merging systems.

We may translate a lower limit on $R$ into an upper limit on 'stochastic' bias if it is independent between the two galaxy classes, as follows. Using the notation of Dekel & Lahav (1998; DL), we may define the stochasticity $\epsilon_i$ ($i = 1, 2$) by $g_i = \langle g_i|\delta\rangle + \epsilon_i$, where the angle brackets denote averages over $\delta$. Then

$$ {\rm Cov}(g_1, g_2) = \langle \langle g_1|\delta\rangle \langle g_2|\delta\rangle \rangle + \langle \epsilon_1 \epsilon_2 \rangle = \sigma^2 \left( \tilde{b}_{12}^2 + \sigma_{b12}^2 \right), \quad \mbox{thus} \qquad (26) $$

$$ R = \frac{{\rm Cov}(g_1, g_2)}{\sigma_{g1}\sigma_{g2}} = \frac{\tilde{b}_{12}^2 + \sigma_{b12}^2}{\left[ (\tilde{b}_1^2 + \sigma_{b1}^2)(\tilde{b}_2^2 + \sigma_{b2}^2) \right]^{1/2}}, \qquad (27) $$

where the first line follows since the cross terms $\langle \langle g_1|\delta\rangle \epsilon_2 \rangle$ vanish, and the second line follows from the definitions of $\tilde{b}_{12}$ and $\sigma_{b12}$, analogous to the definitions $\tilde{b}_1^2 = \langle \langle g_1|\delta\rangle^2 \rangle/\sigma^2$ and $\sigma_{b1}^2 = \langle \epsilon_1^2 \rangle/\sigma^2$ in DL, and $\sigma^2$ is the mass variance on a given smoothing scale. From numerical studies, DL find that $\tilde{b}/\hat{b} \sim 1 - 1.15$, so it is probably reasonable to assume $\tilde{b}_{12} \approx (\tilde{b}_1 \tilde{b}_2)^{1/2}$. Then if we assume $\sigma_{b12} = 0$ (i.e. assuming that the 'stochastic' bias is independent between surveys), and adopting $b_{\rm PSCz} = 1$, $b_{\rm Stromlo} = 1.3$, we find that $R > 0.8$ would imply $\sigma_{b1}^2 < 0.56$ and $\sigma_{b2}^2 < 0.95$. Likewise $R > 0.9$ would imply $\sigma_{b1}^2 < 0.23$ and $\sigma_{b2}^2 < 0.39$. DL show that $b_{\rm var}^2 = \tilde{b}^2 + \sigma_b^2$, so the "fractional stochasticity" is effectively $\sigma_b^2/\tilde{b}^2$, i.e. the ratio of stochastic variance to deterministic variance in galaxy density. If the fractional stochasticity is equal for the two surveys, such that $\sigma_{b1}^2/\tilde{b}_1^2 = \sigma_{b2}^2/\tilde{b}_2^2 = y$, then the above assumptions lead to $1 + y = R^{-1}$. Clearly, these numbers are illustrative rather than definitive: the assumption $\sigma_{b12}^2 = 0$ may not be satisfied in practice, since there could be a 'second parameter' in addition to the mass overdensity (e.g. local shear, gas temperature etc.) which affects the formation of both IRAS and optical galaxies. But this argument does suggest that 'stochastic' bias which preferentially affects either the earliest or latest type galaxies is unlikely to be severe.
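The quoted limits on the stochastic variances can be checked with a short Python sketch. It assumes $\sigma_{b12} = 0$ and $\tilde{b}_{12}^2 = \tilde{b}_1 \tilde{b}_2$, as in the text, and additionally assumes that each limit is obtained by setting the other class's stochastic variance to zero; that last reading is an interpretation not stated explicitly in the text. The function names are hypothetical.

```python
def sigma_b2_upper_limit(r_lower, b):
    """Upper limit on sigma_b^2 for a class with bias b, given R > r_lower.

    With sigma_b12 = 0 and the other class free of stochasticity,
    R = b / sqrt(b^2 + sigma_b^2), so sigma_b^2 < b^2 * (1 / R^2 - 1).
    """
    return b ** 2 * (1.0 / r_lower ** 2 - 1.0)

def equal_fractional_stochasticity(r):
    """Fractional stochasticity y = sigma_b^2 / b^2 when it is equal for both
    classes, from the relation 1 + y = 1 / R."""
    return 1.0 / r - 1.0

b_pscz, b_stromlo = 1.0, 1.3   # bias values adopted in the text
for r_lower in (0.8, 0.9):
    print(r_lower,
          sigma_b2_upper_limit(r_lower, b_pscz),
          sigma_b2_upper_limit(r_lower, b_stromlo),
          equal_fractional_stochasticity(r_lower))
```

Under these assumptions the limits evaluate to approximately 0.563 and 0.951 for $R > 0.8$, and 0.235 and 0.396 for $R > 0.9$, in line with the values quoted above.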

Our results are encouraging for velocity field studies, in that they suggest that large-scale density fields (usually estimated from IRAS galaxies) should also be a good match to those which would be estimated from all-sky optical surveys (currently rather limited in extent); this supports the conclusions of Baker et al. (1998).

In a future paper we plan to apply the tests considered here to the PSCz and the Optical Redshift Survey (Santiago et al. 1995), which gives a higher sampling density than Stromlo-APM and thus may provide stronger constraints.
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